Subspace Embeddings and ℓp-Regression Using Exponential Random Variables
Abstract
Oblivious low-distortion subspace embeddings are a crucial building block for numerical linear algebra problems. We show that for any real p, 1 ≤ p < ∞, given a matrix M ∈ R^{n×d} with n ≫ d, with constant probability we can choose a matrix Π with max(1, n^{1−2/p}) poly(d) rows and n columns so that simultaneously for all x ∈ R^d, ‖Mx‖_p ≤ ‖ΠMx‖_∞ ≤ poly(d) ‖Mx‖_p. Importantly, ΠM can be computed in the optimal O(nnz(M)) time, where nnz(M) is the number of non-zero entries of M. This generalizes all previous oblivious subspace embeddings, which required p ∈ [1, 2] due to their use of p-stable random variables. Using our matrices Π, we also improve the best known distortion of oblivious subspace embeddings of ℓ_1 into ℓ_1 with Õ(d) target dimension in O(nnz(M)) time from Õ(d^3) to Õ(d^2), which can further be improved to Õ(d^{3/2}) log^{1/2} n if d = Ω(log n), answering a question of Meng and Mahoney (STOC, 2013). We apply our results to ℓ_p-regression, obtaining a (1+ε)-approximation in O(nnz(M) log n) + poly(d/ε) time, improving the best known poly(d/ε) factors for every p ∈ [1, ∞) \ {2}. If one is interested only in a poly(d)-approximation rather than a (1+ε)-approximation to ℓ_p-regression, a corollary of our results is that for all p ∈ [1, ∞) the ℓ_p-regression problem can be solved without general convex programming: because our subspace embeds into ℓ_∞, it suffices to solve a linear program. Finally, we give the first protocols for the distributed ℓ_p-regression problem for every p ≥ 1 that are nearly optimal in communication and computation.
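The abstract does not spell out how Π is built, so the following is only an illustrative sketch of one well-known property that plausibly explains why exponential random variables can stand in for p-stable variables at every p: if E_1, ..., E_n are independent standard exponentials, then max_i |y_i| / E_i^{1/p} has the same distribution as ‖y‖_p / E^{1/p} for a single standard exponential E. The function names (`scaled_max`, `lp_norm`, `demo`) and the test vector are hypothetical choices made for this example; the code checks the property empirically by comparing medians and is not the paper's embedding.

```python
import math
import random


def lp_norm(y, p):
    """Return the l_p norm of the vector y."""
    return sum(abs(v) ** p for v in y) ** (1.0 / p)


def scaled_max(y, p, rng):
    """Return max_i |y_i| / E_i^(1/p) with E_i i.i.d. standard exponentials."""
    return max(abs(v) / rng.expovariate(1.0) ** (1.0 / p) for v in y)


def median(xs):
    """Return the median of a non-empty list of numbers."""
    s = sorted(xs)
    m = len(s) // 2
    return s[m] if len(s) % 2 else 0.5 * (s[m - 1] + s[m])


def demo(p=3.0, trials=20000, seed=0):
    """Compare the empirical median of the scaled maximum with its prediction.

    The median of a standard exponential is ln 2, and ||y||_p / E^(1/p) is
    decreasing in E, so its median is ||y||_p / (ln 2)^(1/p).
    """
    rng = random.Random(seed)
    y = [1.0, -2.0, 0.5, 3.0]  # arbitrary illustrative vector
    samples = [scaled_max(y, p, rng) for _ in range(trials)]
    predicted = lp_norm(y, p) / math.log(2.0) ** (1.0 / p)
    return median(samples), predicted


if __name__ == "__main__":
    empirical, predicted = demo()
    print(f"empirical median: {empirical:.4f}, predicted: {predicted:.4f}")
```

Note that p = 3 lies outside the range [1, 2] covered by p-stable variables, which is the regime the abstract highlights. The coordinate-wise maximum is an ℓ_∞-type quantity, which is in line with the abstract's embedding of ℓ_p into ℓ_∞.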
Similar resources
Subspace Embeddings for the Polynomial Kernel
Sketching is a powerful dimensionality reduction tool for accelerating statistical learning algorithms. However, its applicability has been limited to a certain extent since the crucial ingredient, the so-called oblivious subspace embedding, can only be applied to data spaces with an explicit representation as the column span or row span of a matrix, while in many settings learning is done in a...
Experimental study for the comparison of classifier combination methods
In this paper, we compare the performances of classifier combination methods (bagging, modified random subspace method, classifier selection, parametric fusion) to logistic regression in consideration of various characteristics of input data. Four factors used to simulate the logistic model are: (a) combination function among input variables, (b) correlation between input variables, (c) varianc...
A Random Walk with Exponential Travel Times
Consider the random walk among N places with N(N - 1)/2 transports. We attach an exponential random variable Xij to each transport between places Pi and Pj and take these random variables mutually independent. If transports are possible or impossible independently with probability p and 1-p, respectively, then we give a lower bound for the distribution function of the smallest path at point log...
A Bernstein-type Inequality for Suprema of Random Processes with Applications to Model Selection in Non-gaussian Regression
Let (Xt)t∈T be a family of real-valued centered random variables indexed by a countable set T . In the first part of this paper, we establish exponential bounds for the deviation probabilities of the supremum Z = supt∈T Xt by using the generic chaining device introduced in Talagrand (1995). Compared to concentration-type inequalities, these bounds offer the advantage to hold under weaker condit...
Journal: CoRR
Volume: abs/1305.5580; issue: -
Pages: -
Publication date: 2013